7.3A A HIGH-SPEED DIGITAL SIGNAL PROCESSOR FOR ATMOSPHERIC RADAR
Tycho Technology, Inc.
Boulder, CO 80306

GENERAL OVERVIEW 

The Tycho Technology Model SP-320 is a high-speed pipelined digital signal 
processing system designed around the capabilities of the Texas Instruments 
TMS32010 16/32-Bit Signal Processor. This device is a monolithic realization 
of a complex general purpose signal processor, incorporating such features as a 
32-bit ALU, a 16-bit x 16-bit combinatorial multiplier, and a 16-bit barrel 
shifter (TEXAS INSTRUMENTS, 1983a). The SP-320 is designed to operate as a slave 
processor to a host general purpose computer in applications such as coherent 
integration of a radar return signal in multiple ranges, or dedicated FFT
processing. Piggyback modules for A/D conversion and I/O interfacing may be
added to the main PC board (see Figure 1). Presently available is an I/O module
conforming to the Intel Multichannel interface standard (INTEL CORPORATION,
1983); other I/O modules will be designed to meet specific user requirements.

The main processor board (exclusive of A/D and I/O modules) includes input 
and output FIFO (First In First Out) memories, both with depths of 4096 W, to
permit asynchronous operation between the source of data and the host computer. 
This design permits burst data rates in excess of 5 MW/s. 

(a) Areas of Application 

The SP-320 was initially designed as a coherent integrator for atmospheric
radar systems. In the course of development, it became apparent that with the
addition of a few hardware features, the board could be made useful for a much 
broader class of mathematical and signal processing problems. 

Coherent Integration for Radar. Design criteria for this application
included a 1-MHz sample rate, 12 bits raw data precision, and 256 range gates.

The SP-320 has a digital input data path width of 13 bits, and will support
burst data rates to 5 MHz. In practice, the sampling rate is limited by the
A/D conversion time required for the desired precision. For example, 12 bits
in 0.5 µs is about the limit for a cost-no-object system using a single
board-level converter per analog channel; a reasonable compromise is 10 bits in
1.0 µs using a hybrid A/D converter. Assuming range accumulators of 32 bits, 64
ranges can be accommodated using only the internal data memory of the TMS32010.
Using external data memory, a maximum of 2048 ranges can be used. However,
external range accumulators require approximately 3.6 µs for the read-add-write
sequence compared to 1.4 µs for internal accumulators. Thus, for a large
number of ranges, the system may become computation limited, requiring a
reduction in pulse repetition frequency. Also, for a large number of ranges used
in conjunction with a small number of points per integration, the system may
become output limited at the processor-host interface. For a discussion of
these and other performance trade-offs see TYCHO TECHNOLOGY, INC. (1984). For
the realistic case of a 1-MHz sample rate, 256 ranges, 6 parallel data channels,
and a host interface capable of a DMA transfer rate of 1 W per µs, the SP-320
can support a pulse repetition period of 1 ms with any number of integration
points greater than 8.
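
The computation-limited case described above can be checked with simple arithmetic. The following Python sketch multiplies the number of ranges by the per-range read-add-write times quoted in the text (1.4 µs internal, 3.6 µs external) to estimate the accumulate time per radar pulse. It assumes, for simplicity, that every accumulator is external as soon as more than 64 ranges are used, and it ignores loop and output overhead; the function names are illustrative only and are not taken from any Tycho software.

```python
# Per-range accumulate times quoted in the text (microseconds).
INTERNAL_US = 1.4   # accumulator held in TMS32010 internal data memory
EXTERNAL_US = 3.6   # accumulator held in SP-320 external data memory
MAX_INTERNAL_RANGES = 64
MAX_EXTERNAL_RANGES = 2048


def accumulate_time_us(n_ranges):
    """Estimate the time to update every range accumulator once per pulse.

    Assumption: all accumulators are external once n_ranges exceeds 64.
    """
    if not 1 <= n_ranges <= MAX_EXTERNAL_RANGES:
        raise ValueError("n_ranges must be between 1 and 2048")
    per_range = INTERNAL_US if n_ranges <= MAX_INTERNAL_RANGES else EXTERNAL_US
    return n_ranges * per_range


def is_computation_limited(n_ranges, pulse_period_us):
    """True if the accumulate pass does not fit in one pulse repetition period."""
    return accumulate_time_us(n_ranges) > pulse_period_us


if __name__ == "__main__":
    for n in (64, 256, 2048):
        t = accumulate_time_us(n)
        limited = is_computation_limited(n, 1000.0)
        print(f"{n:5d} ranges: {t:8.1f} us per pulse, limited at 1 ms: {limited}")
```

Under these assumptions, 256 ranges need about 922 µs per pulse, which fits within a 1-ms pulse repetition period, whereas 2048 ranges would need roughly 7.4 ms and would force a lower pulse repetition frequency.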

Figure 1. SP-320 overall block diagram.

A feature of the TMS32010 that is of particular interest in the context
of coherent integration is the MPYK (multiply by constant) instruction. This
allows the raw data word to be multiplied by a 16-bit constant previously stored
in the T register with no execution time penalty compared to simply entering
the raw data into the processor. The user thus may impose a window function on
the integration by periodically reloading the T register from a table in
program memory.
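
The windowed integration described above can be modeled in a few lines. The Python sketch below is a behavioral illustration only, not TMS32010 code: it assumes one window weight per transmitted pulse (standing in for the value reloaded into the T register) and uses unbounded Python integers, so 32-bit accumulator overflow is not modeled. The function name and data layout are hypothetical.

```python
def windowed_integration(pulses, window):
    """Coherently integrate pulses with one window weight per pulse.

    pulses: list of pulses, each a list of integer samples (one per range gate).
    window: list of integer weights, one per pulse; each weight stands in for
            the T register value reloaded from a table before that pulse.
    Returns one weighted sum per range gate.
    """
    if len(pulses) != len(window):
        raise ValueError("need exactly one window weight per pulse")
    n_ranges = len(pulses[0])
    sums = [0] * n_ranges
    for weight, pulse in zip(window, pulses):
        if len(pulse) != n_ranges:
            raise ValueError("every pulse must have the same number of ranges")
        for r, sample in enumerate(pulse):
            # Multiply the raw sample by the current weight and accumulate.
            sums[r] += weight * sample
    return sums


if __name__ == "__main__":
    pulses = [[1, 2], [3, 4], [5, 6]]   # three pulses, two range gates
    window = [1, 2, 1]                  # simple triangular weighting
    print(windowed_integration(pulses, window))
```

Setting every weight to 1 reduces this to plain coherent integration, which is why the window costs nothing extra when the multiply is already part of the input step.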

For those users who wish to implement a pre-integration pulse-pair
processing algorithm, the SP-320 provides an Inter-Channel Link. This allows the
processors of two associated quadrature channels to pass data words back and
forth as required for complex arithmetic.

Digital Filters. The TMS32010 was designed with a strong emphasis on
the efficient execution of digital filter algorithms. In particular, hardware 
macro instructions such as LTD allow very fast manipulation of running lists of 
data points. See TEXAS INSTRUMENTS (1983b), section 8, and RABINER and GOLD 
(1975) for more details on digital filter design. The SP-320 can be used as a 
fast real-time digital filter by employing the TMS32010, using only internal 
data memory, and the input and output FIFOs. 
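
The "running list of data points" mentioned above is the delay line of a direct-form FIR filter. The Python sketch below shows the structure in behavioral form: each new sample shifts the delay line by one position (the data move that the TMS32010 LTD instruction performs alongside the load and accumulate) and the output is the sum of coefficient-sample products. It is an illustration under these assumptions, not a transcription of any TMS32010 program, and the names are hypothetical.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter using an explicit delay line.

    samples: input sequence (integers or floats).
    coeffs:  filter coefficients, coeffs[0] applied to the newest sample.
    """
    delay = [0] * len(coeffs)        # delay[0] holds the newest sample
    output = []
    for x in samples:
        # Shift the delay line by one position and insert the new sample.
        delay = [x] + delay[:-1]
        acc = 0
        for c, d in zip(coeffs, delay):
            acc += c * d             # multiply-accumulate over all taps
        output.append(acc)
    return output


if __name__ == "__main__":
    # The response to a unit impulse reproduces the coefficients.
    print(fir_filter([1, 0, 0, 0], [3, 2, 1]))
```

Feeding a unit impulse through the filter is a quick sanity check, because the output must then equal the coefficient list followed by zeros.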

FFT Processing. The SP-320 supports both real-time and batch FFT
processing. Maximum conversion efficiency is achieved by using straight-line
code and a maximum of 64 complex points of 16-bit integer precision. See Figure
5 in MAGAR et al. (1982) for a summary of conversion times for various sizes of
transforms and for both the straight-line and looped code cases. For example,
a 64-point complex transform with straight-line code can be completed in 738
µs. A 1024-point transform with looped code requires 76 ms. Within the
64-point limit, the TMS32010 does not need to access external data memory, and
with straight-line code can achieve conversion times that compare favorably with
some dedicated FFT processors. Even with the much longer conversion times
required by larger transforms and looped code (required because of the 4 kW
program memory size limitation), the SP-320 may in some circumstances be a
useful compromise between a hardware FFT processor and a software FFT running on
a general purpose microprocessor.
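
To put the quoted conversion times in terms of a sustainable input rate, the short Python calculation below divides the transform length by the transform time. It assumes back-to-back transforms with no time spent on data input, output, or windowing, so the results are upper bounds rather than measured figures; the function name is illustrative.

```python
def sustained_rate_hz(n_points, transform_time_s):
    """Upper bound on the complex sample rate for back-to-back transforms."""
    if n_points <= 0 or transform_time_s <= 0:
        raise ValueError("n_points and transform_time_s must be positive")
    return n_points / transform_time_s


if __name__ == "__main__":
    # Conversion times quoted in the text.
    straight_line_64 = sustained_rate_hz(64, 738e-6)
    looped_1024 = sustained_rate_hz(1024, 76e-3)
    print(f"64-point straight-line: {straight_line_64 / 1e3:.1f} k points/s")
    print(f"1024-point looped:      {looped_1024 / 1e3:.1f} k points/s")
```

On these figures the 64-point straight-line case sustains roughly 86.7 thousand complex points per second, against roughly 13.5 thousand for the 1024-point looped case.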

HARDWARE DESCRIPTION 

The SP-320 consists of a main PC board with external connections grouped
on two headers. These headers mate with connectors on the two piggyback
modules, the 320-AD (A/D Conversion Module), and the 320-MC (Multichannel Bus
Interface). For those users who wish to design custom interfacing for the
SP-320, the two headers allow direct access to the input and output FIFOs and to
the program and data memories.

(a) Main Board 

The TMS32010 is a "Harvard Architecture" device, i.e. the program and data 
memories are separate. The SP-320 implements the full 4k W addressable program 
memory space. The TMS32010 has 144 W of internal data memory and only 3 b of 
external data address. To make the SP-320 a general purpose processor, the 8 
externally addressable data objects are regarded as I/O ports, and 4k W (or 2k
double precision words, DPW) of data memory are furnished, with
auto-incrementing (or decrementing) address generation under control of the
ports (see section entitled "Port Assignments").

Raw Data Input. Normal data input for real-time applications is by way
of the input FIFO through header J1 using a Data Valid/Read control sequence.
This input path is 12 b wide; each data word enters the TMS32010 in one
instruction cycle (200 ns) by use of the MPYK instruction. A hardware switching
scheme injects the current output word of the input FIFO into the constant field
of the instruction. This feature may be enabled/disabled under software
control to allow normal use of the MPYK instruction. Jumper settings allow the
raw data word to be interpreted as either natural binary or two's complement
binary. In the latter case, the data word is automatically sign-extended in
the TMS32010 to 16 bits.
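
The difference between the two jumper settings can be illustrated numerically. The Python sketch below interprets a raw input word either as natural binary or as two's complement, and shows the 16-bit pattern that results from sign extension. The 12-bit default width follows the figure given in this paragraph; the function names are hypothetical and the code models the arithmetic only, not the hardware.

```python
def to_signed(word, width=12, twos_complement=True):
    """Interpret a raw 'width'-bit input word as an integer value."""
    mask = (1 << width) - 1
    word &= mask
    if twos_complement and word & (1 << (width - 1)):
        # Sign bit set: the word represents a negative number.
        return word - (1 << width)
    return word


def to_16bit_pattern(value):
    """Return the 16-bit two's complement bit pattern of a signed value."""
    return value & 0xFFFF


if __name__ == "__main__":
    raw = 0x800   # most negative 12-bit two's complement code
    print(to_signed(raw), hex(to_16bit_pattern(to_signed(raw))))
    print(to_signed(raw, twos_complement=False))
```

The same raw code 0x800 reads as -2048 (pattern 0xF800 after sign extension) in two's complement mode and as +2048 in natural binary mode.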

Data Memory. The SP-320 data memory is organized as 4096 16-bit words.
However, odd and even addresses are accessed by separate I/O ports. This
arrangement was determined to be optimum for implementing double precision
accumulators for coherent integration. Address generation is external to the
TMS32010 and address incrementing, under control of 4 bits in an increment
control register, can be made to occur automatically after any or all of the
following: read low word, read high word, write low word, write high word. For
example, in coherent integration for radar, using double precision accumulators,
one would set the increment control bit for "increment after write high word".
Then, after the sequence: read low word, add, write low word, read high word,
add, write high word; the data memory address would automatically be incremented
for the next range gate. Address, data, and control lines for block transfers to
and from data memory are available at header J2.
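
The read-add-write sequence with automatic address stepping can be modeled as follows. This Python sketch is a toy simulation under stated assumptions: the bit positions chosen for the increment control register are hypothetical (the text does not give them), the address counter indexes LO/HI word pairs, and the class and function names are invented for illustration.

```python
# Hypothetical bit assignments for the increment control register; the
# actual layout is not given in the text.
INC_AFTER_READ_LO = 0x1
INC_AFTER_READ_HI = 0x2
INC_AFTER_WRITE_LO = 0x4
INC_AFTER_WRITE_HI = 0x8


class DataMemory:
    """Toy model of the data memory as LO/HI halves of double precision words."""

    def __init__(self, n_words=2048, inc_control=0):
        self.lo = [0] * n_words
        self.hi = [0] * n_words
        self.addr = 0                    # external address counter
        self.inc_control = inc_control

    def _maybe_increment(self, event):
        if self.inc_control & event:
            self.addr = (self.addr + 1) % len(self.lo)

    def read_lo(self):
        value = self.lo[self.addr]
        self._maybe_increment(INC_AFTER_READ_LO)
        return value

    def read_hi(self):
        value = self.hi[self.addr]
        self._maybe_increment(INC_AFTER_READ_HI)
        return value

    def write_lo(self, value):
        self.lo[self.addr] = value & 0xFFFF
        self._maybe_increment(INC_AFTER_WRITE_LO)

    def write_hi(self, value):
        self.hi[self.addr] = value & 0xFFFF
        self._maybe_increment(INC_AFTER_WRITE_HI)


def accumulate(mem, sample):
    """Add a signed sample to the 32-bit accumulator at the current address."""
    s = sample & 0xFFFFFFFF                       # 32-bit two's complement
    lo = mem.read_lo() + (s & 0xFFFF)             # read low word, add
    mem.write_lo(lo)                              # write low word
    hi = mem.read_hi() + (s >> 16) + (lo >> 16)   # read high word, add carry
    mem.write_hi(hi)                              # write high word, then step


def signed_value(mem, index):
    """Return accumulator 'index' as a signed 32-bit integer."""
    v = (mem.hi[index] << 16) | mem.lo[index]
    return v - (1 << 32) if v & 0x80000000 else v


if __name__ == "__main__":
    mem = DataMemory(n_words=4, inc_control=INC_AFTER_WRITE_HI)
    for pulse in ([100, -200, 70000, -1], [100, -200, 70000, -1]):
        mem.addr = 0                  # models reloading the address counter
        for sample in pulse:
            accumulate(mem, sample)
    print([signed_value(mem, i) for i in range(4)])
```

Because only the "after write high word" bit is set, the address advances exactly once per range gate, so the inner loop needs no explicit address arithmetic.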

Output FIFO. For pipelined processing, the SP-320 provides a 16 b wide
by 4096 W deep output FIFO. The width (16 b as opposed to e.g. 32 b) was
determined by the standard interface definitions (Multichannel, IEEE-488, etc.)
in general use for scientific applications. For an application such as
integration with double precision accumulators, one would program the system to
write the results of the last sequence of additions to the output FIFO in low,
high order instead of returning the results to data memory. The host would
then be expected to transfer out the range sums at an average rate sufficient to
keep up with the processor. For batch operations, the output FIFO may still be
used as the output path, as an alternative to a DMA transfer out of data memory.
FIFO output is available at header J2.

Program Memory. The TMS32010 executes instructions from a 16 b by 4096
W RAM, which must be loaded by the host computer with TMS32010 object code.
Assertion of the Reset line places the TMS32010 in an inactive state and makes
the program memory address and data lines available at header J2 for block
loading by the host. Upon deassertion of Reset, the TMS32010 begins execution at
address 0000.

Figure 2. Format for loading data address counter.


(b) A/D Converter Module 

The 320-AD piggyback module (see Figure 3) mates with J1 on the SP-320.
Its main components are a Teledyne Philbrick 4860 track-hold amplifier and a
Burr-Brown ADC803 analog-to-digital converter. Input voltage ranges of ±10 V,
±5 V, and 0 to -10 V are jumper selectable. The converter's clock rate is
adjustable, and the end-of-conversion point is jumper selectable; these features
allow the module to be set up for maximum conversion speed at any precision from
8 bits to 12 bits. The output format can be jumper selected to offset binary or
two's complement binary. An output latch holds the result of the last
conversion while the current conversion is in process, allowing continuous
pipelined operation at the maximum speed of the converter. The output latch is
16 bits wide with full sign extension, for maximum versatility in interfacing
the module to devices other than the SP-320. The maximum continuous sampling
rate varies from 2 MHz at a precision of 8 bits to 0.66 MHz at 12 bits
(BURR-BROWN CORPORATION, 1983).
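
The two selectable output formats differ only in the most significant bit. As a minimal illustration, the Python sketch below converts an offset binary code to the signed value that the two's complement format would represent, by inverting the MSB and then reading the result as two's complement. The 12-bit default matches the maximum precision quoted above; the function name is hypothetical and the code describes the number formats, not the module's circuitry.

```python
def offset_binary_to_signed(code, bits=12):
    """Convert an offset binary code to a signed integer value.

    Inverting the MSB of an offset binary code yields the two's complement
    code for the same input level.
    """
    if not 0 <= code < (1 << bits):
        raise ValueError("code out of range for the given precision")
    flipped = code ^ (1 << (bits - 1))        # invert the MSB
    if flipped & (1 << (bits - 1)):
        return flipped - (1 << bits)          # negative two's complement value
    return flipped


if __name__ == "__main__":
    for code in (0x000, 0x800, 0xFFF):
        print(hex(code), offset_binary_to_signed(code))
```

The lowest offset binary code maps to the most negative value, mid-scale maps to zero, and the highest code maps to the most positive value.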

Figure 3. Block diagram of 320-AD module.

(c) Multichannel™ Bus Interface Module

The 320-MC piggyback module mates with J2 on the SP-320. This module
serves as a high-speed parallel interface between the SP-320 and a Multichannel
bus at the Basic Talker/Listener level of compliance (see INTEL CORPORATION,
1983, section IV). The module includes counters for address generation and
word count to support high-speed DMA transfers. Up to 15 modules may be
connected to the same bus under the control of a single DMA device having
Multichannel™ supervisor capabilities. All of the functions available at J2 are
supported by the 320-MC as register and memory assignments. The Multichannel
Device Number for the module is jumper selectable.


PROGRAMMING CONSIDERATIONS

The following hardware details have a bearing on the programming of the 
SP-320; they should be considered in conjunction with the TMS32010 instruction 
set (TEXAS INSTRUMENTS, 1983a). 


(a) Port Assignments 


The 8 I/O ports of the TMS32010 are decoded separately for read and write,
and perform the following functions:

Port   Read Function              Write Function

0      not used                   load data address counter*
1      read LO data word          write LO data word
2      read HI data word          write HI data word
3      read inter-channel link    write inter-channel link
4      read system status byte    write flag byte
5      not used                   write to output FIFO
6      not used                   not used
7      not used                   not used


*See Figure 2 for data address counter loading format. 
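
For host-side tools or simulators it can be convenient to hold the port assignments as a lookup table. The Python sketch below simply transcribes the table above into a dictionary, with None standing for "not used"; the names PORT_FUNCTIONS, port_function, and usable_ports are hypothetical helpers, not part of any SP-320 software.

```python
# (read function, write function) for each TMS32010 I/O port, transcribed
# from the port assignment table. None means the port is not used.
PORT_FUNCTIONS = {
    0: (None, "load data address counter"),
    1: ("read LO data word", "write LO data word"),
    2: ("read HI data word", "write HI data word"),
    3: ("read inter-channel link", "write inter-channel link"),
    4: ("read system status byte", "write flag byte"),
    5: (None, "write to output FIFO"),
    6: (None, None),
    7: (None, None),
}


def port_function(port, direction):
    """Return the function of a port for 'read' or 'write', or None if unused."""
    if port not in PORT_FUNCTIONS:
        raise ValueError("port must be in the range 0 to 7")
    if direction not in ("read", "write"):
        raise ValueError("direction must be 'read' or 'write'")
    read_fn, write_fn = PORT_FUNCTIONS[port]
    return read_fn if direction == "read" else write_fn


def usable_ports(direction):
    """List the port numbers that have a defined function in this direction."""
    return [p for p in sorted(PORT_FUNCTIONS)
            if port_function(p, direction) is not None]


if __name__ == "__main__":
    print("readable:", usable_ports("read"))
    print("writable:", usable_ports("write"))
```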

(b) Timing and Interrupts

The SP-320 requires that an external 40-MHz TTL level clock signal be
supplied to J3 (an SMA female connector on the main PC board). Use of an
external clock allows the synchronization of multiple data channels executing
the same program.

Two provisions have been made to enable the system to avoid over-reading
the input FIFO. The empty line of the FIFO controller is routed to the INT
pin, and the Data Ready line to the BIO pin of the TMS32010 (see TEXAS
INSTRUMENTS, 1983). Thus the user has the option of either checking BIO status
before each data input, or using an interrupt handler to respond to the empty
condition.
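
The polling option can be sketched as a simple simulation. In the Python model below, the data_ready property stands in for the Data Ready status seen on the BIO pin, and the reader checks it before every read so that an empty FIFO is never read. The class, the polling limit, and the function names are assumptions made for illustration; the signal polarity and the actual TMS32010 branch instructions are not modeled.

```python
from collections import deque


class InputFifo:
    """Toy input FIFO exposing a data-ready status."""

    def __init__(self, words):
        self._queue = deque(words)

    @property
    def data_ready(self):
        # Stands in for the Data Ready line routed to the BIO pin.
        return len(self._queue) > 0

    def read(self):
        if not self._queue:
            raise RuntimeError("over-read: input FIFO is empty")
        return self._queue.popleft()


def read_block(fifo, n_words, max_polls=1000):
    """Read up to n_words, checking data_ready before every read.

    max_polls bounds the simulation; real code would wait for new data.
    """
    words = []
    polls = 0
    while len(words) < n_words and polls < max_polls:
        polls += 1
        if fifo.data_ready:
            words.append(fifo.read())
    return words


if __name__ == "__main__":
    fifo = InputFifo([10, 20, 30])
    print(read_block(fifo, 2))
    print(read_block(fifo, 5, max_polls=10))   # only one word remains
```

Polling costs a status check per word, whereas the interrupt option leaves the inner loop untouched and handles the empty condition only when it occurs.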

The flag byte (I/O port 4) has 4 user-definable bits that control on-board 
LEDs for diagnostic purposes. These bits are also readable at J2. The user 
could, for example, define one of these bits as a "task completed" indicator 
for batch processing and use it to initiate a host interrupt. 